AITopics | macro avg 0

Collaborating Authors

macro avg 0

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Swift Hydra: Self-Reinforcing Generative Framework for Anomaly Detection with Multiple Mamba Models

Do, Nguyen, Nguyen, Truc, Hassanaly, Malik, Alharbi, Raed, Seo, Jung Taek, Thai, My T.

arXiv.org Machine LearningMar-8-2025

Despite a plethora of anomaly detection models developed over the years, their ability to generalize to unseen anomalies remains an issue, particularly in critical systems. This paper aims to address this challenge by introducing Swift Hydra, a new framework for training an anomaly detection method based on generative AI and reinforcement learning (RL). Through featuring an RL policy that operates on the latent variables of a generative model, the framework synthesizes novel and diverse anomaly samples that are capable of bypassing a detection model. These generated synthetic samples are, in turn, used to augment the detection model, further improving its ability to handle challenging anomalies. Swift Hydra also incorporates Mamba models structured as a Mixture of Experts (MoE) to enable scalable adaptation of the number of Mamba experts based on data complexity, effectively capturing diverse feature distributions without increasing the model's inference time. Empirical evaluations on ADBench benchmark demonstrate that Swift Hydra outperforms other state-of-the-art anomaly detection models while maintaining a relatively short inference time. From these results, our research highlights a new and auspicious paradigm of integrating RL and generative AI for advancing anomaly detection.

avg 0, macro avg 0, weighted avg 0, (14 more...)

arXiv.org Machine Learning

2503.06413

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > South Korea (0.04)
Asia > Middle East > Saudi Arabia (0.04)

Genre: Research Report (0.64)

Industry:

Energy (1.00)
Health & Medicine > Therapeutic Area (0.46)
Government > Regional Government > North America Government > United States Government (0.46)
Information Technology > Security & Privacy (0.45)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.44)

Add feedback

Comparative Study of Zero-Shot Cross-Lingual Transfer for Bodo POS and NER Tagging Using Gemini 2.0 Flash Thinking Experimental Model

Narzary, Sanjib, Brahma, Bihung, Mahilary, Haradip, Brahma, Mahananda, Som, Bidisha, Nandi, Sukumar

arXiv.org Artificial IntelligenceMar-6-2025

Part-of-Speech (POS) tagging and Named Entity Recognition (NER) are fundamental tasks within the field of Natural Language Processing (NLP), serving as essential prerequisites for a multitude of downstream applications. POS tagging, the process of assigning grammatical categories to individual words within a sentence (e.g., noun, verb, adjective, adverb), provides crucial syntactic information that underpins higher-level language understanding. NER, on the contrary, focuses on identifying and classifying named entities - real-world objects that are designated with a proper name - into predefined semantic categories such as persons, organizations, locations, dates, times, and quantities [1, 2]. The synergy of POS and NER tagging empowers a wide spectrum of NLP applications. In information extraction, NER helps to pinpoint key entities, while POS tags help to understand the relationships between these entities and other words in the text, facilitating the extraction of structured information from unstructured text [3]. Machine translation systems benefit from POS tagging to improve syntactic analysis and word order prediction, and NER to ensure accurate translation of named entities in languages [4]. Question-answer systems rely on both NER and POS to understand the question's intent, identify relevant entities and relationships in the knowledge base, and formulate accurate answers. Text summarization algorithms leverage NER to identify salient entities and POS tags to preserve grammatical coherence and readability in summaries.

computational linguistic, evaluation precision recall, gemini 2, (10 more...)

arXiv.org Artificial Intelligence

2503.04405

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
Asia > Indonesia > Bali (0.04)
(7 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Add feedback

Open or Closed LLM for Lesser-Resourced Languages? Lessons from Greek

Pavlopoulos, John, Bakagianni, Juli, Pouli, Kanella, Gavriilidou, Maria

arXiv.org Artificial IntelligenceJan-22-2025

Natural Language Processing (NLP) for lesser-resourced languages faces persistent challenges, including limited datasets, inherited biases from high-resource languages, and the need for domain-specific solutions. This study addresses these gaps for Modern Greek through three key contributions. First, we evaluate the performance of open-source (Llama-70b) and closed-source (GPT-4o mini) large language models (LLMs) on seven core NLP tasks with dataset availability, revealing task-specific strengths, weaknesses, and parity in their performance. Second, we expand the scope of Greek NLP by reframing Authorship Attribution as a tool to assess potential data usage by LLMs in pre-training, with high 0-shot accuracy suggesting ethical implications for data provenance. Third, we showcase a legal NLP case study, where a Summarize, Translate, and Embed (STE) methodology outperforms the traditional TF-IDF approach for clustering \emph{long} legal texts. Together, these contributions provide a roadmap to advance NLP in lesser-resourced languages, bridging gaps in model evaluation, task innovation, and real-world impact.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2501.12826

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > Dominican Republic (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(4 more...)

Genre: Research Report (0.82)

Industry:

Law (1.00)
Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

INSIGHTBUDDY-AI: Medication Extraction and Entity Linking using Large Language Models and Ensemble Learning

Romero, Pablo, Han, Lifeng, Nenadic, Goran

arXiv.org Artificial IntelligenceDec-27-2024

Medication Extraction and Mining play an important role in healthcare NLP research due to its practical applications in hospital settings, such as their mapping into standard clinical knowledge bases (SNOMED-CT, BNF, etc.). In this work, we investigate state-of-the-art LLMs in text mining tasks on medications and their related attributes such as dosage, route, strength, and adverse effects. In addition, we explore different ensemble learning methods (\textsc{Stack-Ensemble} and \textsc{Voting-Ensemble}) to augment the model performances from individual LLMs. Our ensemble learning result demonstrated better performances than individually fine-tuned base models BERT, RoBERTa, RoBERTa-L, BioBERT, BioClinicalBERT, BioMedRoBERTa, ClinicalBERT, and PubMedBERT across general and specific domains. Finally, we build up an entity linking function to map extracted medical terminologies into the SNOMED-CT codes and the British National Formulary (BNF) codes, which are further mapped to the Dictionary of Medicines and Devices (dm+d), and ICD. Our model's toolkit and desktop applications are publicly available (at \url{https://github.com/HECTA-UoM/ensemble-NER}).

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2409.19467

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > United Kingdom > England > Greater Manchester > Manchester (0.05)
Europe > Netherlands > South Holland > Leiden (0.04)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Health Care Technology (0.68)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.70)

Add feedback

A Temporal Convolutional Network-based Approach for Network Intrusion Detection

Nazre, Rukmini, Budke, Rujuta, Oak, Omkar, Sawant, Suraj, Joshi, Amit

arXiv.org Artificial IntelligenceDec-23-2024

Network intrusion detection is critical for securing modern networks, yet the complexity of network traffic poses significant challenges to traditional methods. This study proposes a Temporal Convolutional Network(TCN) model featuring a residual block architecture with dilated convolutions to capture dependencies in network traffic data while ensuring training stability. The TCN's ability to process sequences in parallel enables faster, more accurate sequence modeling than Recurrent Neural Networks. Evaluated on the Edge-IIoTset dataset, which includes 15 classes with normal traffic and 14 cyberattack types, the proposed model achieved an accuracy of 96.72% and a loss of 0.0688, outperforming 1D CNN, CNN-LSTM, CNN-GRU, CNN-BiLSTM, and CNN-GRU-LSTM models. A class-wise classification report, encompassing metrics such as recall, precision, accuracy, and F1-score, demonstrated the TCN model's superior performance across varied attack categories, including Malware, Injection, and DDoS. These results underscore the model's potential in addressing the complexities of network intrusion detection effectively.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2412.17452

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.48)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Take Package as Language: Anomaly Detection Using Transformer

Huang, Jie

arXiv.org Artificial IntelligenceNov-14-2024

Network data packet anomaly detection faces numerous challenges, including exploring new anomaly supervision signals, researching weakly supervised anomaly detection, and improving model interpretability. This paper proposes NIDS-GPT, a GPT-based causal language model for network intrusion detection. Unlike previous work, NIDS-GPT innovatively treats each number in the packet as an independent "word" rather than packet fields, enabling a more fine-grained data representation. We adopt an improved GPT-2 model and design special tokenizers and embedding layers to better capture the structure and semantics of network data. NIDS-GPT has good scalability, supports unsupervised pre-training, and enhances model interpretability through attention weight visualization. Experiments on the CICIDS2017 and car-hacking datasets show that NIDS-GPT achieves 100\% accuracy under extreme imbalance conditions, far surpassing traditional methods; it also achieves over 90\% accuracy in one-shot learning. These results demonstrate NIDS-GPT's excellent performance and potential in handling complex network anomaly detection tasks, especially in data-imbalanced and resource-constrained scenarios. The code is available at \url{https://github.com/woshixiaobai2019/nids-gpt.gi

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2412.04473

Country: Asia > China > Sichuan Province > Chengdu (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology > Security & Privacy (1.00)
Information Technology > Networks (0.87)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Automatic deductive coding in discourse analysis: an application of large language models in learning analytics

Zhang, Lishan, Wu, Han, Huang, Xiaoshan, Duan, Tengfei, Du, Hanxiang

arXiv.org Artificial IntelligenceOct-2-2024

Deductive coding is a common discourse analysis method widely used by learning science and learning analytics researchers for understanding teaching and learning interactions. It often requires researchers to manually label all discourses to be analyzed according to a theoretically guided coding scheme, which is time-consuming and labor-intensive. The emergence of large language models such as GPT has opened a new avenue for automatic deductive coding to overcome the limitations of traditional deductive coding. To evaluate the usefulness of large language models in automatic deductive coding, we employed three different classification methods driven by different artificial intelligence technologies, including the traditional text classification method with text feature engineering, BERT-like pretrained language model and GPT-like pretrained large language model (LLM). We applied these methods to two different datasets and explored the potential of GPT and prompt engineering in automatic deductive coding. By analyzing and comparing the accuracy and Kappa values of these three classification methods, we found that GPT with prompt engineering outperformed the other two methods on both datasets with limited number of training samples. By providing detailed prompt structures, the reported work demonstrated how large language models can be used in the implementation of automatic deductive coding.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2410.0124

Country:

North America > United States (0.14)
North America > Canada > Quebec > Montreal (0.14)
Asia > China > Beijing > Beijing (0.04)
(2 more...)

Genre:

Instructional Material (1.00)
Research Report > New Finding (0.46)

Industry:

Education > Educational Setting > Online (0.93)
Education > Educational Technology > Educational Software > Computer Based Training (0.93)
Education > Assessment & Standards > Student Performance (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Severity Prediction in Mental Health: LLM-based Creation, Analysis, Evaluation of a Novel Multilingual Dataset

Skianis, Konstantinos, Pavlopoulos, John, Doğruöz, A. Seza

arXiv.org Artificial IntelligenceSep-25-2024

Large Language Models (LLMs) are increasingly integrated into various medical fields, including mental health support systems. However, there is a gap in research regarding the effectiveness of LLMs in non-English mental health support applications. To address this problem, we present a novel multilingual adaptation of widely-used mental health datasets, translated from English into six languages (Greek, Turkish, French, Portuguese, German, and Finnish). This dataset enables a comprehensive evaluation of LLM performance in detecting mental health conditions and assessing their severity across multiple languages. By experimenting with GPT and Llama, we observe considerable variability in performance across languages, despite being evaluated on the same translated dataset. This inconsistency underscores the complexities inherent in multilingual mental health support, where language-specific nuances and mental health data coverage can affect the accuracy of the models. Through comprehensive error analysis, we emphasize the risks of relying exclusively on large language models (LLMs) in medical settings (e.g., their potential to contribute to misdiagnoses). Moreover, our proposed approach offers significant cost savings for multilingual tasks, presenting a major advantage for broad-scale implementation.

dataset, health condition, llm, (13 more...)

arXiv.org Artificial Intelligence

2409.17397

Country:

Europe > Spain > Aragón (0.04)
Europe > Greece > Epirus > Ioannina (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

GPT-4 Generated Narratives of Life Events using a Structured Narrative Prompt: A Validation Study

Lynch, Christopher J., Jensen, Erik, Munro, Madison H., Zamponi, Virginia, Martinez, Joseph, O'Brien, Kevin, Feldhaus, Brandon, Smith, Katherine, Reinhold, Ann Marie, Gore, Ross

arXiv.org Artificial IntelligenceFeb-8-2024

Large Language Models (LLMs) play a pivotal role in generating vast arrays of narratives, facilitating a systematic exploration of their effectiveness for communicating life events in narrative form. In this study, we employ a zero-shot structured narrative prompt to generate 24,000 narratives using OpenAI's GPT-4. From this dataset, we manually classify 2,880 narratives and evaluate their validity in conveying birth, death, hiring, and firing events. Remarkably, 87.43% of the narratives sufficiently convey the intention of the structured prompt. To automate the identification of valid and invalid narratives, we train and validate nine Machine Learning models on the classified datasets. Leveraging these models, we extend our analysis to predict the classifications of the remaining 21,120 narratives. All the ML models excelled at classifying valid narratives as valid, but experienced challenges at simultaneously classifying invalid narratives as invalid. Our findings not only advance the study of LLM capabilities, limitations, and validity but also offer practical insights for narrative generation and natural language processing applications.

classification, narrative, significance level, (15 more...)

arXiv.org Artificial Intelligence

2402.05435

Country:

North America > Puerto Rico > Peñuelas > Peñuelas (0.04)
North America > United States > Wisconsin > Milwaukee County > Milwaukee (0.04)
North America > United States > Virginia > Suffolk (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Add feedback

Crosslingual Retrieval Augmented In-context Learning for Bangla

Li, Xiaoqian, Nie, Ercong, Liang, Sheng

arXiv.org Artificial IntelligenceDec-2-2023

The promise of Large Language Models (LLMs) in Natural Language Processing has often been overshadowed by their limited performance in low-resource languages such as Bangla. To address this, our paper presents a pioneering approach that utilizes cross-lingual retrieval augmented in-context learning. By strategically sourcing semantically similar prompts from high-resource language, we enable multilingual pretrained language models (MPLMs), especially the generative model BLOOMZ, to successfully boost performance on Bangla tasks. Our extensive evaluation highlights that the cross-lingual retrieval augmented prompts bring steady improvements to MPLMs over the zero-shot performance.

avg 0, computational linguistic, weighted avg 0, (14 more...)

arXiv.org Artificial Intelligence

2311.00587

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(11 more...)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback